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ABSTRACT 

An efficient algorithm for adaptive kernel smoothing (AKS) of two-dimensional imaging 
data has been developed and implemented using the Interactive Data Language (IDL). The 
functional form of the kernel can be varied (top-hat, Gaussian etc.) to allow different weight- 
ing of the event counts registered within the smoothing region. For each individual pixel the 
algorithm increases the smoothing scale until the signal-to-noise ratio (s.n.r.) within the kernel 
reaches a preset value. Thus, noise is suppressed very efficiently, while at the same time real 
structure, i.e. signal that is locally significant at the selected s.n.r. level, is preserved on all 
scales. In particular, extended features in noise-dominated regions are visually enhanced. The 
ASMOOTH algorithm differs from other AKS routines in that it allows a quantitative assess- 
ment of the goodness of the local signal estimation by producing adaptively smoothed images 
in which all pixel values share the same signal-to-noise ratio above the background. 

We apply ASMOOTH to both real observational data (an X-ray image of clusters of galax- 
ies obtained with the Chandra X-ray Observatory) and to a simulated data set. We find the 
ASMOOTHed images to be fair representations of the input data in the sense that the residu- 
als are consistent with pure noise, i.e. they possess Poissonian variance and a near-Gaussian 
distribution around a mean of zero, and are spatially uncorrelated. 

Key words: techniques: image processing - methods: data analysis - methods: statistical 



1 INTRODUCTION 

Smoothing of two-dimensional event distributions is a procedure 
routinely used in many fields of data analysis. In practice, smooth- 
ing means the convolution 

I' if) = J(r) <g>/C(r) = / K(f-r')I(f')df' (1) 

(/>>*<-i) 

of the measured data 1(f) with a kernel function AC (often also 
called 'filter' or 'window function'). Although the raw data may 
be an image in the term's common meaning [i.e. the data set can 
be represented as a function I(x,y) where I is some intensity, and 
x and y are spatial coordinates], the two coordinates x and y can, 
in principle, describe any two-dimensional parameter space. The 
coordinates x and y are assumed to take only discrete values, i.e. 
the events are binned into (x, y) intervals. The only requirement on 
/ that we shall assume in all of the following is that / is the result 
of a counting process in some detector, such that I(x,y) G INo- 
An image, as defined above, is a two-dimensional histogram 



and is thus often a coarse representation of the underlying proba- 
bility density distribution (e.g. Merrit & Tremblay 1994, Vio et al. 
1994). However, for certain experiments, an unbinned event dis- 
tribution may not even exist - for instance if the x and y values 
correspond to discrete PHA (spectral energy) channels. Also, some 
binning can be desirable, for instance, in cases where the dynamic 
range of the data under consideration is large. If the bin size is suffi- 
ciently small, the unavoidable loss of spatial resolution introduced 
by binning the raw event distribution may be a small price to be 
paid for a data array of manageable size. 

Smoothing of high-resolution data is of interest whenever the 
signal (defined as the number of counts per pixel above the ex- 
pected background) in the region of interest in x — y space is low, 
i.e. is less than or of the order of 10, after the raw event distribution 
has been sorted into intervals whose size matches approximately or 
exceeds the instrumental resolution. It is crucial in this context that 
the observed count statistics are not taken at face value but are cor- 
rected for background, which may be internal, i.e. originating from 
the detector (more general: the instrumental setup), or external. If 
this correction is not applied, the observed intensity (counts) distri- 
bution I(x,y) may be high across the region of interest, suggesting 
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good count statistics, even if the signal above the background that 
we are interested in is actually low and poorly sampled. The statis- 
tics of the observed counts alone can thus be a poor indicator of the 
need for image smoothing. 

Rebinning the data set into larger, and thus fewer, intervals 
improves the count statistics per pixel and reduces the need for 
smoothing. This is also the basic idea behind smoothing with a ker- 
nel of the form 



JC(f 



V(™ 2 ) 




where \f | 
elsewhere 



< cr 



(2) 



(circular 'top-hat' or 'box-car' filter of radial size a), the only dif- 
ference being that smoothing occurs semi-continuously (the step 
size being given by the bin size of the original data) whereas re- 
binning requires an additional phase information [the offset of the 
boundaries of the first bin with respect to some point of reference 
such as the origin of the (x, y) coordinate system]. However, when 
starting from an image binned at about the instrumental resolution, 
both rebinning and conventional smoothing share a well known 
drawback, namely that any improvement in the count statistics oc- 
curs at the expense of spatial resolution. 



2 ADAPTIVE KERNEL SMOOTHING 

Although conventional smoothing algorithms usually employ more 
sophisticated functional forms for the kernel than the above 'top- 
hat' filter (the most popular probably being a Gaussian), the prob- 
lem remains that a kernel of fixed size is ill-suited for images that 
feature real structure on various scales, some of which may be 
much smaller or much larger than the kernel size. In such a situ- 
ation, small-scale features tend to get over- smoothed while large- 
scale structure remains under- smoothed. Adaptive-kernel smooth- 
ing (AKS) is the generic term for an approach developed to over- 
come this intrinsic limitation by allowing the kernel to vary over 
the image and adopt a position-dependent 'natural' size. 

AKS is closely related to the problem of finding the optimal 
adaptive kernel estimator of the probability density distribution un- 
derlying a measured, unbinned event distribution. The advantages 
of adaptive kernel estimators for the analysis of discrete, and in 
particular one-dimensional, astronomical data have been discussed 
by various authors (e.g. Thompson 1990; Pisani 1993, 1996; Mer- 
ritt and Tremblay 1994; Vio et al. 1994). An overview of adap- 
tive filtering techniques in two dimensions is given by Lorenz et al. 
(1993). 

A common feature of all non-parametric adaptive kernel algo- 
rithms is that the 'natural' smoothing scale for any given position is 
determined from the number of counts accumulated in its immedi- 
ate environment. Following the aforementioned principle, smooth- 
ing occurs over a large scale where few counts have been recorded, 
and over a small scale where count statistics are good. AKS al- 
gorithms differ, however, in the prescription that defines how the 
amplitude of the local signal is to be translated into a smoothing 
scale. 

A criterion widely used for discrete data is that of Silverman 
(1986). It determines the size, a, of the local kernels relative to that 
of some global (i.e. non-adaptive, fixed) kernel (cr CO nst) by introduc- 
ing a scaling factor which is the inverse square root of the ratio of 
the globally smoothed data to their logarithmic mean. For images, 
and using the same notation as before, this means 



cr(f) 



^c'onst^)}!. 



(3) 



where log 10 (Ie Onst (r))i O g = (log 10 /const (r)), and /^(r) repre- 
sents the convolution of the measured data with a kernel of fixed 
size aconst. However, whether or not this approach yields satisfac- 
tory results depends strongly on the choice of the global smooth- 
ing scale aconst (Vio et al. 1994). In the context of discrete data 
sets, Pisani (1993) suggested a least-squares cross-validation pro- 
cedure to determine an optimal global kernel size in an iterative 
loop. However, for binned data covering a large dynamical range 
(see Section[4]for an example), the dependence of the result on the 
size of the global kernel becomes very sensitive indeed, and the 
iteration becomes very time-consuming. Also the dependence on 
the somewhat arbitrary scaling law (eq.|3| remains. Other adaptive 
filtering techniques discussed recently in the literature include the 
H FILTER algorithm for square images (Richter et al. 1991, see also 
Lorenz et al. 1993) and the AKIS algorithm of Huang & Sarazin 
(1996). Closely related are image decomposition techniques in- 
cluding wavelet-based algorithms (Starck & Pierre 1998, and refer- 
ences therein) and adaptive binning (e.g., Sanders & Fabian 2001, 
Cappellari & Copin 2003, Diehl & Statler 2005). 

In the following, we present ASMOOTH, an AKS algorithm for 
images, i.e. binned, two-dimensional datasets of any size, which 
determines the local smoothing scale from the requirement that the 
above the background-corrected signal-ro-noise ratio (s.n.r.) of any 
signal enclosed by the kernel must exceed a certain, preset value. 
The algorithm is similar to AKIS (Huang & Sarazin 1996) in that 
it employs a signal-to-noise ratio (s.n.r.) criterion to determine the 
smoothing scale 1 . However, other than AKIS, ASMOOTH does not 
require any initial fixed-kernel smoothing but determines the size 
of the adaptive kernel directly and unambiguously from the un- 
smoothed input data. ASMOOTH also goes beyond existing AKS 
algorithms in that its s.n.r. criterion takes the background (instru- 
mental or other) of the raw image into account. This leads to sig- 
nificantly improved noise suppression in the case of large-scale fea- 
tures embedded in high background. Our approach yields smoothed 
images which feature a near-constant (or, alternatively, minimal) 
signal-to-noise ratio above the background in all pixels containing 
a sufficient number of counts. In contrast to most other algorithms 
which require threshold values to be set (e.g., for the H coefficients 
in the case of the H FILTER technique), ASMOOTH is intrinsically 
non-parametric. The only external parameters that need to be speci- 
fied are the minimal and, optionally, maximal signal-to-noise ratios 
(above the background) required under the kernel. 

The simplicity of the determination of the local smoothing 
scale from the counts under the kernel and an estimate of the lo- 
cal background greatly facilitates the translation of the smoothing 
prescription into a simple and robust computer algorithm, and also 
allows a straightforward interpretation of the resulting smoothed 
image. 



3 DESCRIPTION OF THE ALGORITHM 

A SMOOTH adjusts the smoothing scale (i.e. the size of the smooth- 
ing kernel) such that, at every position in the image, the result- 
ing smoothed data values share the same signal-to-noise ratio with 
respect to the background; one may call this the 'uniform signif- 
icance' approach. The only external parameter required by AS- 
MOOTH is the desired minimal s.n.r., r m i n . 

In order to ensure that statistically significant structure is not 
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over- smoothed to an s.n.r. level much higher than r m i n , an s.n.r. 
range can be specified as a pair of r m i n , r max values. Note, how- 
ever, that the maximal s.n.r. criterion is a soft one and, also, is ap- 
plied only at scales larger than the instrumental resolution (which 
is assumed to be similar to, or larger than, the pixel scale); under 
no circumstances will ASMOOTH blur significant pointlike features 
(pixels whose s.n.r. in the unsmoothed image exceeds r m in) in order 
to bring their s.n.r. value down below the r max threshold. 

The background-corrected s.n.r. of any features is computed 
according to one of two definitions. The weaker requirement is 
given by the definition of the "significance of detection" above the 
background, 



(N src - N hkg ) 



(4) 



AiV bkg ' 

where N src and iVbkg are the number of counts under the smooth- 
ing kernel and the background kernel, respectively, and AiV b kg is 
the la uncertainty of the background counts. Alternatively, and by 
default, the more stringent definition of the "significance of the 
source strength measurement" can be used, 

(N src -N hkg ) 



^(AAU) 2 + (AN\ 



(5) 



bkgj 



with AiV src being the la uncertainty of the source counts accumu- 
lated under the smoothing kernel. Either of these definitions assume 
Gaussian statistics by implying that the na error of a measurement 
is equal to n times the la error. 

In addition to the desired s.n.r. limits r m i n ,max, estimates of the 
background Jbkg and the associated background error ATbkg are op- 
tional additional parameter. To allow background variations across 
the image to be taken into account, Jbkg and A/bkg can be supplied 
as images of the same dimensions as the raw image; in the case of 
a flat background Jbkg and A/bkg reduce to global estimates of the 
background and background error per pixel, i.e., single numbers. 
Note that, more often than not, A/bkg / as the background 

estimate will originate from model predictions rather than being 
the result of another counting experiment. If no background infor- 
mation is supplied, ASMOOTH determines a local background from 
an annular region around the adaptive smoothing kernel, extend- 
ing from 3-4a for a Gaussian kernel, and from l-4/3cr for a top-hat 
kernel. 

Internally, the threshold s.n.r. values T mm , r max are translated 
into a minimal and a maximal integral number of counts, Amin, 
iV m ax, to be covered by the kernel. More precisely, the criterion is 
that 



N^ n <l'(r)/JC(d,a(r))<N m 



(6) 



where a(f) is the characteristic, position-dependent scale of the re- 
spective kernel. N m [ n , max in eq.|6]are determined from the definition 
of the minimal and maximal s.n.r. value r m i n ,max, 

-^Vmin^max -/Vbkg 



y/iVmiiwax + AA^ 

where, in analogy to the definition of Ami n ,max (cf. eqs. 11161 . Abkg 
and AiVbkg are the integral number of background counts under the 
respective kernel and the associated error. From eq.[7|follows 



N m 



N hkg 



, 1 2 

i 2 ^min,max 



+ r„ M /iV bkg + AiV4 g + i r^ njnax . (8) 
For an adaptive circular top-hat kernel of size a(f) (cf. eq.|3, eq.|6| 



translates into N m in < 7va(r ) 2 I'(f ) < N max , and the interpreta- 
tion is straightforward: at least Amin, but no more than N ma x, counts 
are required to lie within the area tt a(r ) 2 that the smoothing oc- 
curs over. In the case of a uniform background, the value of ATbkg 
in eq. [8]is simply given by nbkgvr a(r) 2 where ribkg is the global 
background level per pixel in the input image. 

For any given pair of (Ami n , N max ) values, a Gaussian kernel 



JC(r — r a(f )) 



1 



2na(r y 



exp 



2a(f) 2 



(9) 



will yield considerably larger effective smoothing scales than a top- 
hat, as, in two dimensions, more than 60 per cent of the integral 
weight fall outside a 1 a radius, whereas, in the case of a circular 
top-hat kernel, all of Am in needs to be accumulated within a 1 a ra- 
dius. (Note that, according to Eq.|9l it is the weights per unit area 
that follow a Gaussian distribution. The weights per radial annulus 
do not, which is why, for the kernel defined in Eq.|9| the fraction of 
the integral weight that falls outside the 1 a radius is much larger 
than the 32 per cent found for a one-dimensional Gaussian.) Which 
kernel to use is up to the user: ASMOOTH offers a choice of Gaus- 
sian (default) and circular top-hat. 

The algorithm is coded such that the adaptively smoothed im- 
age is accumulated in discrete steps as the smoothing scale in- 
creases gradually, i.e. 



(10) 



where ai starts from an initial value ao which is matched to the 
intrinsic resolution of the raw image (i.e., the pixel size), and h(r) 
is given by 

( 1(f) where N mm < I'(r )//C(0, a t ) < N max 
Ii(r)= < mdl(f)glj(r)j<i (11) 

I elsewhere. 

The adaptively smoothed image is thus accumulated in a "top- 
down" fashion with respect to the observed intensities as ASMOOTH 
starts at small kernel sizes to smooth the vicinity of the bright- 
est pixels, and then increases the kernel size until, eventually, only 
background pixels contribute. Note that condition [H] ensures that 
pixels found to contain sufficient signal at a scale a* will not con- 
tribute to the image layers Ij , ( j > i) subsequently produced with 
smoothing scales aj > ai. Consequently, each feature is smoothed 
at the smallest scale at which it reaches the required background- 
corrected s.n.r. (see eq. |6j, and low-s.n.r. regions are smoothed at 
an appropriately large scale even in the immediate vicinity of image 
areas with very high s.n.r. 

In order to take full advantage of the resolution of the un- 
binned image, the size ao of the smallest kernel is chosen such 
that the area enclosed by /C(r, ao) is about one pixel. For the cir- 
cular top-hat filter of eq.|2|this means ao = 1/v^ f° r me Gaus- 
sian kernel of eq.|9|we have ao = 1/v / 9tt- Subsequent values of 
ai (i > 0) are determined from the requirement that eq.|6|be true. 
If a near-constant s.n.r. value is aimed at with high accuracy, i.e., if 
a T m ax value very close to Train IS chosen, the smoothing scale ai will 
grow in very small increments, and the smoothing will proceed only 
slowly. In all our applications we found values of 
to yield a good compromise between CPU time considerations and 
a sufficiently constant signal-to-noise ratio of the smoothed image. 
If no value for the optional input parameter Tmin IS supplied by the 
user, the code therefore assumes a default value of 
which meets the above requirement for all reasonable values of 

Tmin- 
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While the intrinsic resolution of the raw image (i.e. the pixel 
size) determines the smallest kernel size <to, the size of the im- 
age as a whole represents an upper limit to the size of the kernel. 
Although the convolution can be carried out until the numerical 
array representing the kernel is as large as the image itself, this 
process becomes very CPU time intensive as en increases. Once 
the smoothing scale has exceeded that of the largest structure in the 
image, the criterion of eq.[6|can never be met as only background 
pixels contribute. Since the only features left unsmoothed at this 
stage are insignificant background fluctuations, the algorithm then 
smoothes the remaining pixels with the largest possible kernel. Un- 
avoidably, the signal-to-noise ratio of these last background pixels 
to be smoothed does not meet the condition of eq. |6| Note that in 
the case of the background being determined locally from the data 
themselves (the default), it is the size of the background kernel (an 
annulus around the smoothing kernel) that reaches the limit first. 
Hence, for a Gaussian, the largest smoothing scale (la size of the 
kernel) is 1/8 of the image size, for a top-hat it is 3/8 of the image 
size. 

Since the algorithm uses fast-fourier transformation (FFT) to 
compute the many convolutions required, the overall performance 
of the code is significantly improved if the image size in pixels is an 
integer power of two. The smoothed image obtained from the above 
procedure strictly conserves total counts (within the limitations set 
by the computational accuracy) and provides a fair representation 
of the original data at all positions. 



4 PERFORMANCE OF THE ALGORITHM 

We first demonstrate the performance of our algorithm by applying 
it to an image of X-ray emission from a massive cluster of galax- 
ies taken with the ACIS-S detector on board the Chandra X-ray 
Observatory. Then we use simulated data to test how faithfully As- 
MOOTH reproduces the true count distribution of the input image. 

4.1 Results for Chandra ACIS-S data 

Because of the large range of scales at which features are detected 
in the selected Chandra observation, this X-ray image is ideally 
suited for a demonstration of the advantages of AKS. If photon 
noise is to be suppressed efficiently, the very extended emission 
from the gaseous intracluster medium needs to be smoothed at a 
rather large scale. At the same time, a small smoothing scale, or 
no smoothing at all, is required in high-s.n.r. regions in order to re- 
tain the spatial resolution in the vicinity of bright point-like sources 
(stars, QSOs, AGN) superimposed on the diffuse cluster emission. 
Fixed-kernel smoothing can meet, at most, one of these require- 
ments at a time. 

Our choice of data set has the additional advantage that the 
selected image was taken with an X-ray detector that is also very 
sensitive to charged particles which makes the image particularly 
well suited to emphasize the importance of a proper treatment of 
the background. 

Fig. [T] (top left panel) shows the raw counts detected with 
ACIS-S in the 0.5 to 7 keV energy band in an 89 ks observation 
of the galaxy cluster MS 1054.4—0321. The diffuse emission orig- 
inates from an electron-ion plasma trapped in the gravitational po- 
tential well of the cluster and heated to temperatures of typically 
1 x 10 8 K (corresponding to kT & 10 keV). A detailed analysis of 
this observation is given by Jeltema et al. (2001). The image shown 
here (512x512 pixel 2 ) covers a subsection of the detector spanning 



a 4.2 x 4.2 arcmin 2 region; the pixel size of 0.492 arcsec corre- 
sponds roughly to one on-axis resolution element of the telescope- 
detector configuration. Note the high background throughout the 
image, caused by high-energy cosmic rays. 

Fig.[T]summarizes ASMOOTH results obtained with a Gaussian 
kernel, and for s.n.r. target values of r m m — 2, 3, 4 and T max = 
Tmin + 1, in a three-by-three array of plots below the image of the 
raw data. 

4. 1. 1 ASMOOTHed images 

The left-most column of Fig.[T]shows the adaptively smoothed im- 
ages for the three different r m i n values. ASMOOTH fully preserves 
the high information content of the raw data in the high-s.n.r. re- 
gions corresponding to bright, small-scale features, while at the 
same time heavily smoothing the low-s.n.r. regions of the image 
where the signal approaches the background value. Note, however, 
that for small values of r m i n (top row) the goodness of the local es- 
timation of the signal above the background is relatively low and 
noise is not removed efficiently on all scales. 

Figure illustrates the gradual assembly of the adaptively 
smoothed image by showing the change in various ASMOOTH pa- 
rameters as a function of smoothing step. 

At the highest pixel intensities above the s.n.r. threshold, AS- 
MOOTH occasionally returns smoothed count values that exceed the 
counts in the unsmoothed image: the brightest point source in the 
raw data has a peak value of 126 counts, the ASMOOTHed image 
features a value of 128.5 at the same location. This is due to the 
fact that, although the corresponding pixels themselves remain es- 
sentially unsmoothed and thus keep their original values, there is an 
additional contribution from the larger sized kernels of neighbour- 
ing pixels whose s.n.r. above the background falls short of r m i n . 
Since the final image is accumulated from the partial images re- 
sulting from the successive convolution with the whole set of dif- 
ferently sized kernels (cf. Eq.llOl. the total smoothed intensity can 
become larger than the actually observed counts at pixel positions 
where 1(f) > N m [ n . This artifact is caused by the limited resolu- 
tion of the images and can be noticeable when a large bin size is 
chosen for the original data. The integral number of counts in the 
image is always conserved. 

4.1.2 Kernel size and s. n. r. maps 

As illustrated in Fig. |2| ASMOOTH applies a wide range of smooth- 
ing scales to the input data as the algorithm attempts to fulfill the 
requirement given by Eq.[6]throughout the image. In addition to the 
adaptively smoothed image, ASMOOTH also returns, in an IDL data 
structure, maps of the background-corrected s.n.r. of the pixel val- 
ues in the smoothed image and the kernel size used in the smooth- 
ing process, respectively, 

The second and third columns of the three-py-three panel in 
Fig. [I] show both of these maps for the ASMOOTH images shown 
in the first, left-most column of plots. The maps of ASMOOTH ker- 
nel sizes as well as Fig |2| demonstrate how very small kernel sizes 
of less than or about one pixel (la radius) are assigned to very 
few bright pixels; accordingly, these pixels remain essentially un- 
smoothed. For T m in = 3 (middle row), for instance, some 27 per 
cent of the image pixels are found to satisfy the criterion of Eq.[6|at 
smoothing scales between one and 62 pixels (radius), while the ma- 
jority of the remaining pixels (more than 72 per cent) do not contain 
enough signal to reach the required s.n.r. even at the largest permis- 
sible smoothing scale. The majority of the image is smoothed at 
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Figure 1. Left: X-ray emission detected with Chandra ACIS-S in the 0.5-7 keV band in a 4.2 x 4.2 
arcmin 2 field around the cluster of galaxies MS 1054.4—0321; the intensity scaling is logarithmic. Shown 
are the raw data. Below: Results obtained with ASMOOTH (adaptive Gaussian kernel) for (top to bottom 
row) T m i n = 2, 3, and 4. In all cases T max was set to r m i n + 1. The three columns show (from left to right) 
the adaptively smoothed image (same intensity scaling as used for the image of the raw data shown in the 
single panel at the very top), a map of the kernel sizes (la radius of a two-dimensional Gaussian) used 
by ASMOOTH in the smoothing of the raw data, and a map of the background-corrected s.n.r. of pixel 
values in the adaptively smoothed image. Note how ASMOOTH fully retains signal that reaches the spec- 
ified background-corrected s.n.r. level while, at the same time, heavily suppressing background noise by 
applying a wide range of smoothing scales from much less than one to many tens of pixels. The major- 
ity of the pixels in the raw data contain insufficient signal to reach the specified s.n.r. threshold at any 
permissible smoothing scale and are thus smoothed with the largest possible kernel. Note the correspon- 
dence between the outlines of regions with a signal-to-noise ratio of less than Tmin in the ASMOOTHed 
images (right column) and the outlines of regions smoothed with largest possible kernel (center column). 
With the exception of a few very bright pixels whose s.n.r. exceeds r m ax in the raw data (i.e. without any 

smoothing all other nivek in the A SMOOTHpH image, rontain sicmal whose, harkcrronnd-rorrerted stir, 
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Figure 2. As MOOT h processing parameters as a function of smoothing step 
for one of the examples shown in the Fig0(T mm — 3, T m ax 

= 4). The 

shaded region and solid black line delineate the range of signal-to-noise 
ratios and their median value, respectively, for the set of image pixels pro- 
cessed in each step. The target values r m i n and T max are marked by the 
dashed horizontal lines. Note the nonlinear increase in the smoothing ra- 
dius cr (green line), the cumulative fraction of pixels processed (red line), 
and the cumulative fraction of counts processed. Note also that the criterion 
of eq.|6|is a soft one as far as N max (r max ) is concerned: r values greater 
than r max are tolerated as long as the median value of the s.n.r. distribution 
for each smoothing step is smaller than (r m ax + T m i n )/2. S.n.r. values 
exceeding r m ax occur also at the smallest smoothing scales as the code is 
designed not to blur features whose s.n.r. is higher than r m j n already in the 
raw data. 

the largest possible scale of 63.8 pixels at which the dimensions 
of the background kernel array (a la wide annulus surrounding 
the two-dimensional Gaussian smoothing kernel computed out to 
3cr radius) equals that of the image itself. The s.n.r. maps (Fig.[l] 
right-most column), finally, give evidence of the near-constant s.n.r. 
of all regions smoothed with kernels other than the largest one. 

4.1.3 Statistical properties of the residual image 

The qualitative demonstration of the performance of A SMOOTH 
shown in Fig.[l]can be made more quantitative by comparing it with 
the results obtained with fixed-kernel smoothing. To this end we 
examine the properties of the residual images obtained by subtract- 
ing the respective smoothed image from the observed raw image. 
Following Pina & Puetter (1992) who introduced this criterion in 
the context of Bayesian image reconstruction, we state that, for an 
ideal smoothing algorithm, this residual image should contain only 
uncorrelated Poissonian noise around a zero mean. A global mean 
of zero is guaranteed - within the numerical accuracy of the con- 
volution code - by the requirement that any smoothing algorithm 
conserve counts. The requirements that the mean also be zero lo- 
cally, that the residual signal have Poissonian variance, and that the 
residual signal be uncorrelated across the image are much harder 
to meet. In the following we shall examine how well adaptive and 
fixed kernel smoothing fulfill these requirements. 

Is the residual signal consistent with Poissonian noise shifted 
to zero mean as expected if shot noise dominates? 

If so, the square of the noise (given by the residual sig- 
nal), n^es, should equal the mean (given by the smoothed signal 
^smooth). Since the smoothed image contains a wide range of mean 
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Figure 3. The ratio of the variance of the residual signal to the mean (as 
given by the smoothed signal) as a function of the smoothed signal for 
the example shown in Fig. [l] Only bins containing at least 5 pixels are 
shown. Also shown are Gaussian errors based on the number of pixels per 
bin. For a perfect smoothing algorithm the plotted ratio is unity at all val- 
ues of the mean. The bold lines (red, green, and blue) representing the re- 
sults obtained with ASMOOTH for r m j n = 2, 3, and 4 come close to the 
ideal of a constant value of unity for most values of the mean. Only a few 
bright pixels exhibit significantly higher variance than expected for Pois- 
sonian statistics. By comparison, fixed kernel smoothing (see the fine solid 
lines) results in far too high variances for all but the smallest kernel size. 
Strong deviations from Poissonian statistics are observed over large por- 
tions of the image. The chosen fixed kernel sizes assume the values of the 
l/100 th , 3/100 th , l/10 th , 1 st , and 10 th percentile of the distribution of 
ASMOOTH kernel sizes for r min = 3 (1.1, 2.5, 6.7, 11.4, and 21.9 pixels). 



values, we can test for this condition only within bins of simi- 
lar mean. Figure [3] shows the ratio nr es /n sm0 oth as a function of 
^smooth for both ASMOOTH and a number of fixed kernels of var- 
ious sizes. Note how the residual signal obtained with ASMOOTH 
exhibits a near-Poissonian variance over a larger range of mean val- 
ues than does the residual produced by smoothing with a fixed ker- 
nel of essentially any size. Only the smallest fixed kernel size tested 
here, a c = 1.1 pixels, yields comparable results — a kernel of such 
small size, however, also provides essentially no smoothing. 

Is the signal in the residual image spatially uncorrelated? 

Figure|4]shows the autocorrelation function of the residual im- 
ages obtained with ASMOOTH and fixed kernels of various sizes. 
Here we use the standard definition of the autocorrelation £ as a 
function of radial lag r: 

«r) = <WO W?' + 0) _ j (12) 

\^res / 

where the angular brackets signify averaging over the position r ' 
within the image (e.g. Peebles 1980). In the normalized represen- 
tation of £(r) shown in Fig.|4|an ideal smoothing algorithm would 
produce zero signal at all lags except for a singular value of unity at 
r — 0. ASMOOTH comes close to this ideal: the signal in the resid- 
ual images generated with ASMOOTH is essentially uncorrelated at 
all scales except for a weak (< 2%) positive signal at scales smaller 
than about the maximum adaptive kernel size. Note the very dif- 
ferent result when fixed Gaussian kernels of size cr CO nst are used: 
strong spatial correlations are observed in the residual images at 
all scales smaller than about the kernel size. Only for very small 
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Figure 4. The normalized autocorrelation function (cf. Eq.I121 of the signal 
in the residual images obtained by subtracting the smoothed image from 
the original one for the example shown in Fig.[l] For a perfect smoothing 
algorithm the autocorrelation signal is zero at all lags r > 0. The almost 
indistuinguishable bold lines (red, green, and blue) representing the results 
obtained with ASMOOTH for r m i n = 2,3, and 4 come close to this ideal at 
the price of a slight positive correlation on all scales. Fixed kernel smooth- 
ing always results in significant spatial correlations. The chosen fixed kernel 
sizes assume the values of the l/100 th , 3/100 th , l/10 th , 1 st , and 10 th 
percentile of the distribution of ASMOOTH kernel sizes for r m j n = 3 (1.1, 
2.5,6.7, 11.4, and 21.9 pixels). 



kernel sizes does non-adaptive smoothing come close to meeting 
the requirement that any residual signal after smoothing be spa- 
tially uncorrelated. However, such small kernel sizes perform very 
poorly in the presence of significant structure on a large range of 
scales (cf. Fig. [I] center column). 



4.2 Simulated data 

The example shown in the previous section has the advantage of 
using real data but, for this very reason, does not allow the user 
to assess quantitatively how the ASMOOTHed image compares to 
the true counts distribution underlying the noisy input image. We 
therefore present in this section results obtained for a simulated 
data set that contains multiple extended and point-like features, as 
well as a constant background component. 

Figure [5] summarizes the characteristics of the model used for 
our simulation (top row). The same figure shows, in the bottom row, 
results obtained with ASMOOTH for r m i n = 3, r ma x = T m i n + 1, 
and with the adaptive binning algorithm WVT (Diehl & Statler 
2005) for a target s.n.r. value of 5. Note how the requirement of 
a minimal s.n.r. above the local background (Eqs. 4 and 5) causes 
ASMOOTH to reduce noise much more aggressively than WVT in 
spite of a nominally lower s.n.r. target value. As demonstrated by 
the error distribution depicted in the final panel of Fig. [5] the AS- 
MOOTH residual image exhibits a near Gaussian error distribution 
around zero mean, in addition to featuring Poissonian variance and 
being spatially uncorrelated (Section[I). ASMOOTHed images can 
thus be considered to be fair representations of the 'true' input im- 
age. 



5 SUMMARY 

We describe ASMOOTH, an efficient algorithm for adaptive kernel 
smoothing of two-dimensional image data. ASMOOTH determines 
the local size of the kernel from the requirement that, at each po- 
sition within the image, the signal-to-noise ratio of the counts un- 
der the kernel, and above the background, must reach (but not ex- 
ceed greatly) a certain preset minimum. Qualitatively, this could 
be called the 'uniform significance approach'. As a consequence 
of this criterion, noise is heavily suppressed and real structure is 
enhanced without loss of spatial resolution. Due to the choice of 
boundary conditions, the algorithm preserves the total number of 
counts in the raw data. The ASMOOTHed image is a fair repre- 
sentation of the input data in the sense that the residual image is 
consistent with pure noise, i.e., it the residual possesses Poissonian 
variance and is spatially uncorrelated. As demonstrated by the re- 
sults obtained for simulated input, the adaptively smoothed images 
created by A SMOOTH are fair representations of the true counts 
distribution underlying the noisy input data. 

Since the background is accounted for in the assessment of 
any structure's s.n.r., a feature that distinguishes ASMOOTH from 
most other adaptive smoothing or adaptive binning algorithms, the 
interpretation of the adaptively smoothed image is straightforward: 
all features in the ASMOOTHed image are equally significant at the 
scale of the kernel used in the smoothing. A map of these smooth- 
ing scales is returned by ASMOOTH together with the smoothed 
image. Note that these are the smallest scales at which real features 
reach the selected s.n.r. threshold. Consequently, background re- 
gions never meet the s.n.r. criterion and are smoothed at the largest 
possible scale for which the kernel size equals the size of the im- 
age. Such regions of insufficient signal are easily flagged using an 
s.n.r. map which is also provided by our algorithm. 

We emphasize that our definition of "signal" as meaning "sig- 
nal above the (local) background' was chosen because many, if 
not most, real-life appplications do not provide the user with a pri- 
ori knowledge of the intensity of the background or any spatial 
background variations. Also, superpositions of features (such as the 
point sources on top of diffuse emission in our Chandra ACIS-S 
example) require a local background estimate if the intrinsic sig- 
nal of features on very different scales is to be assessed reliably. 
The approach taken by ASMOOTH is thus intrinsically very differ- 
ent from the one implemented in, for instance, the adaptive binning 
algorithm WVT (Diehl & Statler 2005) and the difference in the 
resulting output is accordingly dramatic (see Section l4!2l . 

ASMOOTH is being used extensively in the analysis of as- 
tronomical X-ray imaging data gathered in a wide range of mis- 
sions from ROSAT to XMM-Newton, and has become the anal- 
ysis tool of choice for the display of high-resolution images ob- 
tained with the Chandra X-ray observatory. Recent examples illus- 
trating ASMOOTH's performance can be found in Ebeling, Mendes 
de Oliveira & White (1995), Brandt, Halpern & Iwasawa (1996), 
Hamana et al. (1997), Ebeling et al. (2000), Fabbiano, Zezas & 
Murray (2001), Krishnamurthi et al. (2001), Karovska et al. (2002), 
Bauer et al. (2002), Gil-Merino & Schindler (2003), Heike et al. 
(2003), Rasmussen, Stevens & Ponman (2004), Clarke, Blanton 

6 Sarazin (2004), Ebeling, Barrett & Donovan (2004), or Pratt 
& Arnaud (2005), as well as in many Chandra press releases 
(http : //chandra .harvard, edu/press/). 

Under development is an improved version of the algorithm 
which accounts for Poisson statistics using the analytic approxi- 
mations of Ebeling (2003) to allow proper treatment of significant 
negative features, such as absorbed regions, detector chip gaps, 
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Figure 5. Top row. Left: The model used in our simulation, a combination of two extended, four compact features, and a spatially invariant background 
component, designed to mimic an astrophysical X-ray image of point sources superimposed on asymmetric diffuse emission from, e.g. a cluster of galaxies. 
Center: Iso-intensity contours for the same model. Right: input image obtained by applying Poisson noise to the model. Bottom row. Left: Iso-intensity 
contours of the ASMOOTHed image obtained with a Gaussian kernel and r m i n = 3 - the contour levels are the same as above for the model image. Center: 
Iso-intensity contours of an adaptively binned image as obtained with the WVT algorithm (Diehl & Statler 2005) for an s.n.r. target value of 5 - the contour 
levels are again the same as for the model image. Note that, unlike ASMOOTH, WVT does not correct for a local background in its s.n.r. estimation. Right: 
Histogram of relative errors, defined as (result - model)/model, in percent for ASMOOTH (solid orange fill) and the WVT adaptive binning code (hatched). 



or instrument elements (e.g., the window support structure of the 
ROSAT PSPC) obscuring part of the image. 

ASMOOTH is written in the IDL programming language. The 
source code is available on request from ebeling@ifa.hawaii.edu. 
A C + + version of an early version of the code, called CSMOOTH, 
is part of CIAO, the official suite of data analysis tools for the 
Chandra X-ray observatory, and is not identical to the algorithm 
described here. 
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